coronavirus genome
AI mathematician, tumour fungi and Africa's coronavirus genomes
AlphaTensor was designed to perform matrix multiplications, but the same approach could be used to tackle other mathematical challenges. Credit: DeepMind
An artificial intelligence (AI) developed by machine-learning company DeepMind in London has tackled a type of calculation called matrix multiplication. The system -- called AlphaTensor -- leverages the skills that DeepMind's game-playing AIs use to beat human players at games such as Go and chess. Matrix multiplication is a widely used mathematical technique that involves multiplying numbers arranged in grids, or matrices, that might represent sets of pixels in images, air conditions in a weather model or the internal workings of an artificial neural network. AlphaTensor broke ground by finding shortcuts to solve these problems with fewer steps (A. The same general approach could have applications in other kinds of mathematical operation, its developers say, such as decomposing complex waves or other mathematical objects into simpler ones.
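To make "fewer steps" concrete, the sketch below shows Strassen's classic scheme for 2x2 matrices, which needs 7 scalar multiplications where the schoolbook method needs 8. This is a long-known shortcut of the same kind, not one of AlphaTensor's results; the function name `strassen_2x2` is illustrative only.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 scalar multiplications (Strassen)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # Seven products, each a single multiplication of two sums or differences.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # The four output entries are recovered with additions only.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


product = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Applied recursively to matrix blocks, saving one multiplication per 2x2 step is what lowers the overall cost for large matrices.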
Application of Markov Structure of Genomes to Outlier Identification and Read Classification
Karr, Alan F., Hauzel, Jason, Porter, Adam A., Schaefer, Marcel
That the sequential structure of genomes is important has been known since the discovery of DNA. In this paper we employ a statistics and stochastic process perspective on triplets of successive bases to address two important applications: identifying outliers in genome databases, and classifying reads in the metagenomic context of reference-guided assembly. From this stochastic process perspective, triplets are a second-order Markov chain specified by the distribution of each base conditional on its two immediate predecessors. To be sure, studying genomes via base sequence distributions is not novel. Previous papers have addressed genome signatures (Karlin et al., 1997; Campbell et al., 1999; Takashi et al., 2003), as well as frequentist (Rosen et al., 2008) and Bayesian (Wang et al., 2007) approaches to classification problems.
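As an illustration of the second-order Markov view described above, the following minimal sketch estimates the distribution of each base conditional on its two immediate predecessors by counting triplets. It is not the authors' code: the function name `second_order_transitions`, the restriction to the alphabet A, C, G, T and the handling of ambiguity codes are assumptions made for the example.

```python
from collections import Counter, defaultdict

BASES = "ACGT"  # assumed alphabet; ambiguity codes such as N are skipped


def second_order_transitions(genome: str) -> dict:
    """Estimate P(next base | two preceding bases) from triplet counts."""
    counts = defaultdict(Counter)
    for i in range(len(genome) - 2):
        triplet = genome[i:i + 3]
        # Ignore triplets that contain anything other than A, C, G or T.
        if all(base in BASES for base in triplet):
            counts[triplet[:2]][triplet[2]] += 1

    probs = {}
    for context, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalise the counts for this two-base context into probabilities.
        probs[context] = {base: next_counts[base] / total for base in BASES}
    return probs


probs = second_order_transitions("ACGTACGTACGA")
```

Each observed two-base context maps to a probability vector over the next base, so a genome is summarised by at most a 16 x 4 table, which can then be compared across genomes or used to score reads.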
Measuring Quality of DNA Sequence Data via Degradation
Karr, Alan F., Hauzel, Jason, Porter, Adam A., Schaefer, Marcel
As public genome databases proliferate, their immense scientific power is tempered by skepticism about their quality. The skepticism is not merely anecdotal: there are documented instances and implications (Commichaux et al., 2021; Langdon, 2014; Steinegger and Salzberg, 2020). Although we argue in Appendix A that data quality should not be construed as comprising only errors in data, the principal contribution of the paper is a novel paradigm for measuring quality of genome sequences by deliberately introducing errors that reduce quality, a process we term degradation. The errors are single nucleotide polymorphisms (SNPs), insertions and deletions that both occur naturally as mutations and arise in next generation sequencing. Our reasoning is that higher quality data are more fragile: the higher the initial quality, the greater the effect of the same amount of degradation.
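The degradation idea can be sketched as follows: each position of a sequence is hit, with some probability, by an SNP, an insertion or a deletion. This is only a hedged illustration, not the paper's procedure; the function name `degrade`, the per-base `rate` parameter and the equal weighting of the three error types are assumptions.

```python
import random

BASES = "ACGT"


def degrade(sequence: str, rate: float, seed: int = 0) -> str:
    """Return a copy of `sequence` with random SNPs, insertions and deletions.

    Each base is altered independently with probability `rate`; the three
    error types are chosen with equal probability (an assumption).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    out = []
    for base in sequence:
        if rng.random() >= rate:
            out.append(base)  # base left untouched
            continue
        kind = rng.choice(("snp", "insertion", "deletion"))
        if kind == "snp":
            # Substitute a different base at this position.
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "insertion":
            # Keep the base and insert a random base after it.
            out.append(base)
            out.append(rng.choice(BASES))
        # For a deletion, nothing is appended.
    return "".join(out)


degraded = degrade("ACGT" * 10, rate=0.1, seed=42)
```

Under this sketch, quality would be assessed by applying the same `rate` to different sequences and comparing how strongly some downstream measure responds; the choice of that measure is left open here.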